Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
- Young children with limited knowledge of formal mathematics can intuitively perform basic arithmetic‐like operations over nonsymbolic, approximate representations of quantity. However, the algorithmic rules that guide such nonsymbolic operations are not entirely clear. We asked whether nonsymbolic arithmetic operations have a function‐like structure, like symbolic arithmetic. Children (n = 74 4‐ to 8‐year‐olds in Experiment 1; n = 52 7‐ to 8‐year‐olds in Experiment 2) first solved two nonsymbolic arithmetic problems. We then showed children two unequal sets of objects, and asked children which of the two derived solutions should be added to the smaller of the two sets to make them “about the same.” We hypothesized that, if nonsymbolic arithmetic follows similar function rules to symbolic arithmetic, then children should be able to use the solutions of nonsymbolic computations as inputs into another nonsymbolic problem. Contrary to this hypothesis, we found that children were unable to reliably do so, suggesting that these solutions may not operate as independent representations that can be used as inputs into other nonsymbolic computations. These results suggest that nonsymbolic and symbolic arithmetic computations are algorithmically distinct, which may limit the extent to which children can leverage nonsymbolic arithmetic intuitions to acquire formal mathematics knowledge.
- Natural policy gradient (NPG) methods are among the most widely used policy optimization algorithms in contemporary reinforcement learning. This class of methods is often applied in conjunction with entropy regularization, an algorithmic scheme that encourages exploration, and is closely related to soft policy iteration and trust region policy optimization. Despite their empirical success, the theoretical underpinnings of NPG methods remain limited, even in the tabular setting. This paper develops nonasymptotic convergence guarantees for entropy-regularized NPG methods under softmax parameterization, focusing on discounted Markov decision processes (MDPs). Assuming access to exact policy evaluation, we demonstrate that the algorithm converges linearly, and even quadratically once it enters a local region around the optimal policy, when computing optimal value functions of the regularized MDP. Moreover, the algorithm is provably stable vis-à-vis inexactness of policy evaluation. Our convergence results accommodate a wide range of learning rates and shed light upon the role of entropy regularization in enabling fast convergence.
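The abstract above does not spell out the update rule, so the following is only a minimal sketch of what entropy-regularized NPG in a tabular, softmax-parameterized setting might look like. It assumes the commonly stated multiplicative form of the update, in which the new policy is proportional to `pi**(1 - eta*tau/(1 - gamma)) * exp(eta*Q/(1 - gamma))`, with `Q` the entropy-regularized ("soft") Q-function. The two-state MDP (`P`, `R`), the helper names `soft_q`, `npg_step`, and `run_npg`, and all parameter values are hypothetical, and policy evaluation is approximated by fixed-point iteration rather than being exact.

```python
import math

def soft_q(P, R, pi, gamma, tau, sweeps=300):
    """Approximate the entropy-regularized Q-function of pi by fixed-point iteration."""
    n_s, n_a = len(R), len(R[0])
    V = [0.0] * n_s
    Q = [[0.0] * n_a for _ in range(n_s)]
    for _ in range(sweeps):
        # Q(s,a) = r(s,a) + gamma * E[V(s')]
        Q = [[R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
        # V(s) = E_{a~pi}[Q(s,a) - tau * log pi(a|s)]
        V = [sum(pi[s][a] * (Q[s][a] - tau * math.log(pi[s][a]))
                 for a in range(n_a)) for s in range(n_s)]
    return Q

def npg_step(pi, Q, eta, gamma, tau):
    """One assumed entropy-regularized NPG update, computed in log space."""
    keep = 1.0 - eta * tau / (1.0 - gamma)  # weight on the old log-policy
    new_pi = []
    for s in range(len(pi)):
        logits = [keep * math.log(pi[s][a]) + eta * Q[s][a] / (1.0 - gamma)
                  for a in range(len(pi[s]))]
        m = max(logits)  # subtract the max for numerical stability
        w = [math.exp(x - m) for x in logits]
        z = sum(w)
        new_pi.append([x / z for x in w])
    return new_pi

def run_npg(P, R, gamma, tau, eta, iters):
    """Start from the uniform policy and apply the update iters times."""
    n_s, n_a = len(R), len(R[0])
    pi = [[1.0 / n_a] * n_a for _ in range(n_s)]
    for _ in range(iters):
        pi = npg_step(pi, soft_q(P, R, pi, gamma, tau), eta, gamma, tau)
    return pi

# Hypothetical 2-state, 2-action MDP: P[s][a] is the next-state distribution.
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.7, 0.3], [0.1, 0.9]]]
R = [[1.0, 0.0],
     [0.0, 0.5]]
gamma, tau = 0.8, 0.5
eta = 0.5 * (1.0 - gamma) / tau  # a step size inside (0, (1 - gamma) / tau]
pi_final = run_npg(P, R, gamma, tau, eta, iters=300)
```

With the largest step size in that range, `eta = (1 - gamma) / tau`, the old-policy term drops out and the update reduces to a softmax over `Q / tau`, which is consistent with the connection to soft policy iteration mentioned in the abstract.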
 An official website of the United States government